Rapid Evaluation of Multiple Density Models

Authors

  • Alexander G. Gray
  • Andrew W. Moore
Abstract

When highly accurate and/or assumption-free density estimation is needed, nonparametric methods are often called upon, most notably the popular kernel density estimation (KDE) method. However, the practitioner is instantly faced with the formidable computational cost of KDE for appreciable dataset sizes, which becomes even more prohibitive when many models with different kernel scales (bandwidths) must be evaluated; this is necessary for finding the optimal model, among other reasons. In previous work we presented an algorithm for fast KDE which addresses large dataset sizes and large dimensionalities, but assumes only a single bandwidth. In this paper we present a generalization of that algorithm allowing multiple models with different bandwidths to be computed simultaneously, in substantially less time than either running the single-bandwidth algorithm for each model independently, or running the standard exhaustive method. We show examples of computing the likelihood curve for 100,000 data and 100 models ranging across 3 orders of magnitude in scale, in minutes or seconds.

1 KERNEL DENSITY ESTIMATION

Density estimation. In situations where the density of a dataset itself (rather than other inferences) is of importance, such as in exploratory scientific data analysis, nonparametric methods are used because they make minimal or no assumptions about the distribution of the data, while achieving high accuracy, making serious density estimation almost synonymous with nonparametric methods [14, 12]. For example, kernel density estimation (KDE), the most widely used and studied nonparametric density estimation method and thus our focus here, can be shown to converge to the true underlying density with probability 1 as more data are observed, with no distribution assumptions at all, requiring only mild conditions on the kernel function and scale [2].
Highly accurate density estimation is also of use as a core engine in probabilistic learning tasks, from classification to regression to clustering (though arguably for some problems density estimation itself may be skipped and the problem better solved directly, e.g. in discriminative classification [16]). Unfortunately, the inherent flexibility which yields these benefits comes at a very high computational cost, which is our primary focus in this paper.

Kernel density estimation. The task is to estimate the density $\hat{p}(x_q)$ for each point $x_q$ in a query (test) dataset $X_Q$ (having size $N_Q$), from which we can also compute the overall log-likelihood of the dataset $\hat{L}_Q = \sum_{q=1}^{N_Q} \log \hat{p}(x_q)$. The 'model' is the training …
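
To make the cost concrete, the following is a minimal sketch of the standard exhaustive method that the abstract uses as a baseline: it evaluates the log-likelihood of a query set under one KDE model per bandwidth. It is not the authors' fast algorithm. The spherical Gaussian kernel, the function names (`gaussian_kernel`, `log_likelihood_curve`, `log_spaced`) and the toy data are all assumptions made for illustration; the excerpt does not specify a kernel.

```python
import math


def gaussian_kernel(dist_sq, h, dim):
    """Spherical Gaussian kernel value (assumed kernel) for a squared distance."""
    norm = (2.0 * math.pi * h * h) ** (dim / 2.0)
    return math.exp(-dist_sq / (2.0 * h * h)) / norm


def log_likelihood_curve(train, query, bandwidths):
    """Exhaustive evaluation of the query log-likelihood, one value per bandwidth.

    Cost is O(N_Q * N * number_of_bandwidths) kernel evaluations.
    """
    dim = len(train[0])
    n = len(train)
    totals = [0.0] * len(bandwidths)
    for xq in query:
        # Squared distances do not depend on the bandwidth, so compute them once.
        d2 = [sum((a - b) ** 2 for a, b in zip(xq, xr)) for xr in train]
        for i, h in enumerate(bandwidths):
            p_hat = sum(gaussian_kernel(d, h, dim) for d in d2) / n
            # A density that underflows to zero gives a log-likelihood of -inf.
            totals[i] += math.log(p_hat) if p_hat > 0.0 else float("-inf")
    return totals


def log_spaced(lo, hi, count):
    """Hypothetical helper: `count` bandwidths spaced evenly on a log scale."""
    step = (math.log(hi) - math.log(lo)) / (count - 1)
    return [math.exp(math.log(lo) + k * step) for k in range(count)]


if __name__ == "__main__":
    train = [[0.0, 0.1], [0.2, -0.1], [1.0, 1.1], [0.9, 0.8]]
    query = [[0.1, 0.0], [1.0, 1.0]]
    hs = log_spaced(0.01, 10.0, 7)  # spans 3 orders of magnitude
    for h, ll in zip(hs, log_likelihood_curve(train, query, hs)):
        print(f"h={h:.4g}  log-likelihood={ll:.4f}")
```

Even this version reuses the squared distances across bandwidths, yet every query point still touches every training point for every model, which is the cost that grows prohibitive at the dataset sizes mentioned in the abstract.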

Similar resources

Efficiency evaluation of wheat farming: a network data envelopment analysis approach

Traditional data envelopment analysis (DEA) models deal with measuring the relative efficiency of decision-making units (DMUs) in which multiple inputs are consumed to produce multiple outputs. One of the drawbacks of these models is that they neglect the internal processes of each system, which may have intermediate products and/or independent inputs and/or outputs. In this paper some methods which are usab...

EVALUATION OF CONCRETE COMPRESSIVE STRENGTH USING ARTIFICIAL NEURAL NETWORK AND MULTIPLE LINEAR REGRESSION MODELS

In the present study, two different data-driven models, artificial neural network (ANN) and multiple linear regression (MLR) models, have been developed to predict the 28-day compressive strength of concrete. Seven different parameters, namely 3/4 mm sand, 3/8 mm sand, cement content, gravel, maximum size of aggregate, fineness modulus, and water-cement ratio, were considered as input variables...

An extended of multiple criteria data envelopment analysis models for ratio data

One of the problems of traditional data envelopment analysis models in the multiple form is that the weights corresponding to certain inputs and outputs are considered zero in the calculation of efficiency; this means that not all input and output components are utilized for the evaluation of efficiency, as some are ignored. The above issue causes the efficiency score of the under evalua...

Intelligent Health Evaluation Method of Slewing Bearing Adopting Multiple Types of Signals from Monitoring System

Slewing bearings, which are widely applied in tanks, excavators and wind turbines, are a critical component of rotating machinery. A standard procedure for bearing life calculation and condition assessment has been established for general rolling bearings; nevertheless, relatively little literature on the health condition assessment of slewing bearings has been published. Real time health condi...

Evaluation of MTT and Trypan Blue assays for radiation-induced cell viability test in HepG2 cells

Background: Cell viability is an important factor in radiation therapy and thus provides a way to quantify the effect of the therapy. Materials and Methods: The viability of human hepatoma (HepG2) cells exposed to radiation was evaluated by both the MTT and Trypan blue assays. The cells were seeded on 96-well plates at a density of 1 × 10^4 cells/well, incubated overnight, and irradiated with...

Designing and Creating a Mouse Using Nature-Inspired Shapes

Human beings have always made the tools and instruments they need using patterns in nature. Mimicking nature has become the foundation of a new science called biomimetics. In the present article, multiple forms and levels in nature were utilized to design and create a mouse. The rivers are a good source for choosing the shape of a mouse with lots of stones abraded through the centuries which ...

Publication date: 2003